gh-138232: Improve performance of dataclasses.asdict by caching dataclass field names#138233
gh-138232: Improve performance of dataclasses.asdict by caching dataclass field names#138233eendebakpt wants to merge 10 commits intopython:mainfrom
Conversation
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
| # order, so the order of the tuple is as the fields were defined. | ||
| return tuple(f for f in fields.values() if f._field_type is _FIELD) | ||
|
|
||
| def _field_names(class_or_instance): |
There was a problem hiding this comment.
Inlining this method gives a bit of performance gain:
%timeit _field_names(s)
%timeit s.__dataclass_field_names__
Results in
41.8 ns ± 1.59 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
17.8 ns ± 0.188 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
picnixz
left a comment
There was a problem hiding this comment.
More generally, I wonder whether many small functions would actually benefit from a C implementation as they seem to target micro-optimizations. I don't know which ones we could optimize and I also don't know if it's worth it but had we ever considered doing this?
Now, I think it's a good trade-off to increase memory per dataclass type but reduce quite drastically the time it takes to do asdict transformations.
|
|
||
| Accepts a dataclass or an instance of one. Excludes pseudo-fields. | ||
| """ | ||
|
|
There was a problem hiding this comment.
My suggestion here is to remove the blank line after """. And I however think that it would be better to actually inline the function but considering we also use this approach for _PARAMS, it would be more consistent to keep it.
|
I addressed the review comments. In the last commit I inlined the |
| # also marks this class as being a dataclass. | ||
| setattr(cls, _FIELDS, fields) | ||
| # Store field names. Excludes pseudo-fields. | ||
| setattr(cls, _FIELD_NAMES, tuple(f.name for f in fields.values() |
There was a problem hiding this comment.
If you want, you can also do cls.__dataclass_field_names__ = ... directly (I would also advise you to consider this as part of the "inline" commit, so you can also force-push this one).
An other alternative is to only inline getattr(x, _FIELDS_NAMES) instead of having a standalone function for that. It could also improve readability a bit and justify the needs of the global _FIELD_NAMES.
IOW, choose among the following:
- Hardcode
__dataclass_field_names__everywhere, without having using getattr/setattr and_FIELD_NAMES. - Use
_FIELD_NAMES+getattr/setattr(..., _FIELD_NAMES)directly. - Use
_FIELD_NAMES+ a module-wide function defined to be the partialization_get_names(x) ~ getattr(x, _FIELD_NAMES).
My preference is (1) or (2) but (3) is an overkill IMO.
There was a problem hiding this comment.
Ok, going for option (1)
| # also marks this class as being a dataclass. | ||
| setattr(cls, _FIELDS, fields) | ||
| # Store field names. Excludes pseudo-fields. | ||
| cls.__dataclass_field_names__ = tuple(f.name for f in fields.values() |
There was a problem hiding this comment.
hi! no one asked me for review, but I'd like to put in my two cents :)
I believe this could be faster by moving from tuple(genexp) to tuple([listcomp])
this is definitely not the place where the most computational time spent for creating a dataclass, but anyways :)
Benchmark results with PGO on Linux:
(script in issue)